*The (zipped) file "raw_data.7z" contains two files that can be used for replication:
	*"token_ids_corpora_noPUNCT.csv" - this file was used in the main part of the paper. For this version, punctuation and CARD numbers were excluded by deleting all tokens that contain only non-letters.
	*"token_ids_corpora.csv" - this file was used in the appendix. Here, punctuation and numbers are included.
*For copyright and license reasons, the data only contain token_ids instead of the actual word types. The variable token_id is a unique identifier of each word type. 
*To obtain the actual word types that correspond to each token_id, please contact the head of the corpus linguistics department at IDS, Marc Kupietz (kupietz@ids-mannheim.de).
*The csv file contains: 
**the date information (month/year)
**a variable called "token_id_original" that corresponds to the original order of the word tokens
**ten variables called "token_id_random_1" to "token_id_random_10" that correspond to the ten different random arrangements of the order of all articles, as mentioned in the paper.
*For the Litmus test, the variable "token_id_random_1" was used.
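*The column layout listed above can be checked before running the replication code. The following Python sketch is not part of code.7z. It assumes a comma-separated file with a header row that uses exactly the variable names given in this README. The helper name "missing_token_columns" is hypothetical. The date column is not checked, because its exact name is not stated here.

```python
import csv

# Token-id variables named in this README: the original order plus ten random orders.
TOKEN_COLUMNS = ["token_id_original"] + [f"token_id_random_{i}" for i in range(1, 11)]


def missing_token_columns(lines, delimiter=","):
    """Return the README-listed token-id columns that are absent from the header.

    `lines` is any iterable of text lines, e.g. an open file object.
    Assumption: the first line is a header row; the delimiter is a guess.
    """
    header = next(csv.reader(lines, delimiter=delimiter), [])
    present = {name.strip() for name in header}
    return [name for name in TOKEN_COLUMNS if name not in present]
```

*To use it, open the unpacked csv file in text mode (for example with open("token_ids_corpora_noPUNCT.csv", newline="")) and pass the file object to missing_token_columns. An empty list means that all eleven token-id variables were found. If the file uses a different separator, pass it via the delimiter argument.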

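*The exclusion rule behind "token_ids_corpora_noPUNCT.csv" (deleting all tokens that contain only non-letters) can be illustrated as follows. This is a minimal Python sketch and not the code that was used to build the file. It assumes that "letter" means any Unicode alphabetic character, and the function names are hypothetical. Because the published data contain only token_ids, the sketch can only be applied to word types obtained separately.

```python
def contains_letter(token):
    """True if at least one character of the token is alphabetic (Unicode-aware)."""
    return any(ch.isalpha() for ch in token)


def drop_non_letter_tokens(tokens):
    """Keep only tokens with at least one letter; punctuation and pure numbers are dropped."""
    return [token for token in tokens if contains_letter(token)]
```

*Under this assumption, a mixed token such as "3D" is kept because it contains a letter, while "," and "2015" are removed.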

*The (zipped) file "code.7z" contains the Stata and R code that can be used for replication.
*The (zipped) file "R_results.7z" contains the results of the replication with R.
***********************************
*Contact: koplenig@ids-mannheim.de

